Code Switching and Mixed Language Genesis in Tiwi
Authors
Abstract
Similar resources
Mixed Language and Code-Switching in the Canadian Hansard
While there has been lots of interest in code-switching in informal text such as tweets and online content, we ask whether code-switching occurs in the proceedings of multilingual institutions. We focus on the Canadian Hansard, and automatically detect mixed language segments based on simple corpus-based rules and an existing word-level language tagger. Manual evaluation shows that the performa...
Code-switching and language change in Tunisia
This article quantitatively studies the patterns of Tunisian Arabic/French codeswitching and the possible implications for contact-induced change in the Tunisian dialect. The purpose is to account for the extent of the occurrence of code-switching across gender lines and levels of education and assess its role in the interference from French into Arabic, both at the lexical and structural level...
Language Code Switching in Web Corpora
One of the challenges in building and using web corpora is their rather high content of “noise”, most notably having the form of foreign-language text fragments within otherwise monolingual text. Our paper presents an approach trying to cope with this problem by means of “exhaustive” stop-word lists provided by morphosyntactic taggers. As a side effect of the procedure, a problem of tagging tex...
Language Identification in Code-Switching Scenario
This paper describes a CRF based token level language identification system entry to Language Identification in CodeSwitched (CS) Data task of CodeSwitch 2014. Our system hinges on using conditional posterior probabilities for the individual codes (words) in code-switched data to solve the language identification task. We also experiment with other linguistically motivated language specific as ...
Code-Switching Ubique Est - Language Identification and Part-of-Speech Tagging for Historical Mixed Text
In this paper, we describe the development of a language identification system and a part-of-speech tagger for Latin-Middle English mixed text. To this end, we annotate data with language IDs and Universal POS tags (Petrov et al., 2012). As a classifier, we train a conditional random field classifier for both sub-tasks, including features generated by the TreeTagger models of both languages. Th...
Journal
Journal title: Annual Meeting of the Berkeley Linguistics Society
Year: 2012
ISSN: 2377-1666, 0363-2946
DOI: 10.3765/bls.v38i0.3346